Contrast Pattern Mining with Gap Constraints for Peptide Folding Prediction

Authors

  • Chinar C. Shah
  • Xingquan Zhu
  • Taghi M. Khoshgoftaar
  • Justin Beyer
Abstract

In this paper, we propose a peptide folding prediction method that discovers contrast patterns to differentiate and predict peptide folding classes. A contrast pattern is defined as a set of sequentially associated amino acids that appears frequently in one type of folding but is significantly infrequent in other folding classes. Our hypothesis is that each type of peptide folding has its own unique interaction patterns among peptide residues (amino acids). Contrast patterns act as signatures, or features, for predicting a peptide’s folding type. To this end, we propose a two-phase peptide folding prediction framework: the first phase discovers contrast patterns from different types of contrast datasets, and the second is a learning process that uses all discovered patterns as features to build a supervised classifier for folding prediction. Experimental results on two benchmark protein datasets indicate that the proposed framework can outperform approaches based on simple secondary structure prediction for peptide folding prediction.
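
The two phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it enumerates candidate patterns by brute force, and the function names, the gap bounds (`min_gap`, `max_gap`), the support thresholds (`min_sup`, `max_sup`), and the toy sequences are all assumptions chosen for illustration. It only shows what "frequent in one class, infrequent in the others, under a gap constraint" could look like in code, and how the selected patterns become binary features for a classifier.

```python
from itertools import product


def matches(pattern, seq, min_gap=0, max_gap=1):
    """True if `pattern` occurs in `seq` with between min_gap and max_gap
    residues skipped between each pair of consecutive pattern symbols."""
    def extend(k, pos):
        # k: index of the next pattern symbol; pos: where the previous one matched
        if k == len(pattern):
            return True
        lo = pos + 1 + min_gap
        hi = min(pos + 1 + max_gap, len(seq) - 1)
        return any(seq[j] == pattern[k] and extend(k + 1, j)
                   for j in range(lo, hi + 1))

    return any(seq[i] == pattern[0] and extend(1, i) for i in range(len(seq)))


def support(pattern, seqs, min_gap=0, max_gap=1):
    """Fraction of sequences in `seqs` that contain `pattern`."""
    return sum(matches(pattern, s, min_gap, max_gap) for s in seqs) / len(seqs)


def contrast_patterns(target, others, alphabet, length=2,
                      min_sup=0.6, max_sup=0.2, min_gap=0, max_gap=1):
    """Phase 1 (illustrative): keep patterns that are frequent in `target`
    and infrequent in `others`. Thresholds are hypothetical."""
    found = []
    for cand in product(alphabet, repeat=length):
        p = "".join(cand)
        if (support(p, target, min_gap, max_gap) >= min_sup
                and support(p, others, min_gap, max_gap) <= max_sup):
            found.append(p)
    return found


def featurize(seq, patterns, min_gap=0, max_gap=1):
    """Phase 2 input (illustrative): one binary feature per contrast pattern."""
    return [int(matches(p, seq, min_gap, max_gap)) for p in patterns]


# Toy data, invented for this sketch (not from the paper's benchmarks).
target = ["ACDA", "AGCD", "ACCD"]   # sequences of one folding class
others = ["DCAA", "GGAD", "DDCA"]   # sequences of the remaining classes

patterns = contrast_patterns(target, others, alphabet="ACDG")
print(patterns)                      # ['AC', 'CD']
print(featurize("GACD", patterns))   # [1, 1]
```

The resulting feature vectors could then be passed to any supervised learner; the abstract does not say which classifier the authors use, so that step is left out here.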

Similar resources

Application of Gap-Constraints Given Sequential Frequent Pattern Mining for Protein Function Prediction

OBJECTIVES: Predicting protein function from the protein-protein interaction network is challenging because of the complexity and huge scale of the protein interaction process, along with its inconsistent patterns. Previously proposed methods such as neighbor counting, network analysis, and graph pattern mining have predicted functions by calculating the rules and probabilities of patterns inside the network. Althou...

Efficiently Mining Closed Subsequences with Gap Constraints

Mining frequent subsequence patterns from sequence databases is a typical data mining problem, and various efficient sequential pattern mining algorithms have been proposed. In many problem domains (e.g., biology), the frequent subsequences confined by predefined gap requirements are more meaningful than general sequential patterns. In this paper we re-examine the closed sequential patter...

Generalization of Pattern-Growth Methods for Sequential Pattern Mining with Gap Constraints

The problem of sequential pattern mining is one of several that have received particular attention in the general area of data mining. Despite important developments in recent years, the best algorithm in the area (PrefixSpan) does not deal with gap constraints and consequently does not allow background knowledge to be introduced into the process. In this paper we present the gen...

cSPADE -UE: Algorithm for Sequence Mining for Unstructured Elements Using Time Gap Constraints

We present a new state machine that combines two techniques for complex data sequences: data modeling and frequent sequence mining. This algorithm relies on an unstructured variable gap sequence miner to mine frequent patterns with different gaps between elements. Here we will have two variations: Sequence pruning technique for other primary frequent sequences to reduce space complexity and allow...

NOSEP: Nonoverlapping Sequence Pattern Mining With Gap Constraints.

Sequence pattern mining aims to discover frequent subsequences as patterns in a single sequence or a sequence database. By combining gap constraints (or flexible wildcards), users can specify special characteristics of the patterns and discover meaningful subsequences suitable for their own application domains, such as finding gene transcription sites from DNA sequences or discovering patterns ...

Publication date: 2008